
Conversation

@misrasaurabh1 (Contributor) commented on Aug 28, 2025

PR Type

Enhancement, Bug fix


Description

  • Add pytest test splitting for parallel tracing

  • Launch multiple tracer subprocesses in parallel

  • Aggregate multiple replay test paths

  • Clean up all generated replay files


Diagram Walkthrough

flowchart LR
  A["CLI args with pytest"] -- "detect test paths" --> B["pytest_split()"]
  B -- "multiple splits" --> C["spawn Popen per split"]
  C -- "wait & collect" --> D["gather replay_test_paths"]
  D -- "optimize" --> E["codeflash --replay-test <paths>"]
  B -- "single/no split" --> F["single subprocess.run"]
  F -- "collect" --> D
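
A minimal sketch of the orchestration the diagram describes; trace_splits and build_argv are illustrative placeholders, not the actual tracer.py API:

import subprocess
from pathlib import Path

def trace_splits(pytest_splits: list[list[str]], build_argv) -> None:
    # Single or no split: fall back to one blocking subprocess.run.
    if len(pytest_splits) <= 1:
        only_split = pytest_splits[0] if pytest_splits else []
        subprocess.run(build_argv(only_split, 1), check=False)
        return
    # Multiple splits: spawn one tracer subprocess per split, then wait for all.
    processes = [
        subprocess.Popen(build_argv(split, i), cwd=Path.cwd())
        for i, split in enumerate(pytest_splits, start=1)
    ]
    for process in processes:
        process.wait()
    # The caller then gathers each split's replay test path and hands the
    # combined list to codeflash --replay-test <paths>.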

File Walkthrough

Relevant files (Enhancement)

codeflash/tracer.py (+89/-40): Parallelize tracer execution for pytest runs

  • Introduce pytest-aware splitting and parallel tracing.
  • Spawn multiple tracer subprocesses with split-specific args.
  • Collect multiple replay test paths and pass to optimizer.
  • Ensure cleanup of all replay and trace artifacts.

codeflash/tracing/pytest_parallelization.py (+83/-0): Add pytest test splitting utility

  • Add pytest_split to parse pytest args and discover tests.
  • Randomize and split tests into balanced groups (see the sketch below).
  • Support directories and files; handle edge cases.
  • Return split groups and original test paths.
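
The balanced splitting in pytest_parallelization.py can be approximated with a shuffle plus round-robin assignment; a minimal sketch, not the exact pytest_split implementation:

import random

def split_tests(test_files: list[str], num_splits: int) -> list[list[str]]:
    # Shuffle so slow tests are less likely to cluster in a single split.
    files = list(test_files)
    random.shuffle(files)
    # Never create more splits than there are test files.
    num_splits = max(1, min(num_splits, len(files)))
    # Round-robin assignment keeps group sizes within one test of each other.
    return [files[i::num_splits] for i in range(num_splits)]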

Signed-off-by: Saurabh Misra <[email protected]>

github-actions bot commented Aug 28, 2025

PR Reviewer Guide 🔍

(Review updated until commit 060a78c)

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Possible Issue

When building per-split output paths, the code assumes outfile is not None. If outfile is None, accessing parent, stem, or suffix will raise an AttributeError. Validate outfile up front or guard this branch.

outpath = parsed_args.outfile
outpath = outpath.parent / f"{outpath.stem}_{i}{outpath.suffix}"
args_dict["output"] = str(outpath)
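
A small guard along these lines would avoid the crash; the fallback name is a hypothetical default for illustration, not the PR's actual behavior:

from pathlib import Path
from typing import Optional

def per_split_output_path(outfile: Optional[Path], i: int) -> Path:
    # Derive a per-split output path; fall back to a hypothetical default name
    # when no outfile was provided.
    if outfile is None:
        return Path(f"codeflash_trace_{i}.trace")
    return outfile.parent / f"{outfile.stem}_{i}{outfile.suffix}"
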
API Robustness

Using pytest.Parser().parse_known_args on arbitrary CLI args may not be stable across pytest versions and may not capture file_or_dir as expected. Consider a safer way to extract test paths (e.g., filter positional paths that exist) to avoid ImportError/version drift.

    import pytest

    parser = pytest.Parser()

    pytest_args = parser.parse_known_args(arguments)
    test_paths = getattr(pytest_args, "file_or_dir", None)
    if not test_paths:
        return None, None

except ImportError:
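
One version-stable alternative is to skip pytest's Parser entirely and keep only positional arguments that resolve to real paths; a rough sketch of that idea, with an illustrative function name:

from pathlib import Path

def extract_test_paths(arguments: list[str]) -> list[str]:
    # Treat any non-flag argument that points at an existing file or directory
    # as a test path; everything else is left for pytest to interpret.
    return [arg for arg in arguments if not arg.startswith("-") and Path(arg).exists()]
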
Return Type Consistency

The function returns different shapes: list[list[str]] in most cases, but a flat list[str] when splitting is unnecessary. This complicates callers and can cause mis-detection of parallelism. Standardize on always returning list[list[str]] and treat len(splits) <= 1 as the single-split case.

    return test_files, test_paths

num_splits = min(num_splits, max_possible_splits)
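
A caller-side normalization that enforces the list[list[str]] shape could look like this; the name is illustrative only:

def normalize_splits(splits) -> list[list[str]]:
    # Wrap a flat list of test files into a single split so callers can always
    # treat len(splits) <= 1 as the non-parallel case.
    if not splits:
        return []
    if isinstance(splits[0], str):
        return [list(splits)]
    return [list(split) for split in splits]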


github-actions bot commented Aug 28, 2025

PR Code Suggestions ✨

Latest suggestions up to 060a78c
Explore these optional code suggestions:

Category | Suggestion | Impact
Possible issue
Avoid shared dict mutation in parallel

Avoid reusing and mutating a single args_dict across parallel processes; this can
race and produce mismatched output/result_pickle_file_path values. Create a fresh
args per split and pass a pre-serialized JSON string to Popen. Also guard outpath
when outfile is None to prevent attribute errors.

codeflash/tracer.py [109-139]

 if len(pytest_splits) > 1:
     processes = []
     test_paths_set = set(test_paths)
     result_pickle_file_paths = []
     for i, test_split in enumerate(pytest_splits, start=1):
         result_pickle_file_path = get_run_tmp_file(Path(f"tracer_results_file_{i}.pkl"))
         result_pickle_file_paths.append(result_pickle_file_path)
-        args_dict["result_pickle_file_path"] = str(result_pickle_file_path)
+        # Build per-process args without mutating shared dict
+        per_args = dict(args_dict)
+        per_args["result_pickle_file_path"] = str(result_pickle_file_path)
         outpath = parsed_args.outfile
-        outpath = outpath.parent / f"{outpath.stem}_{i}{outpath.suffix}"
-        args_dict["output"] = str(outpath)
+        if outpath is not None:
+            outpath = outpath.parent / f"{outpath.stem}_{i}{outpath.suffix}"
+            per_args["output"] = str(outpath)
+        else:
+            per_args["output"] = None
         updated_sys_argv = []
         for elem in sys.argv:
             if elem in test_paths_set:
                 updated_sys_argv.extend(test_split)
             else:
                 updated_sys_argv.append(elem)
-        args_dict["command"] = " ".join(updated_sys_argv)
+        per_args["command"] = " ".join(updated_sys_argv)
+        args_json = json.dumps(per_args)
         processes.append(
             subprocess.Popen(
                 [
                     SAFE_SYS_EXECUTABLE,
                     Path(__file__).parent / "tracing" / "tracing_new_process.py",
                     *updated_sys_argv,
-                    json.dumps(args_dict),
+                    args_json,
                 ],
                 cwd=Path.cwd(),
             )
         )
     for process in processes:
         process.wait()
Suggestion importance[1-10]: 7


Why: Mutating a single args_dict inside the loop can cause incorrect per-process parameters; building a per-process copy is safer, and handling a None outfile avoids an attribute error. The change is contextually correct and improves robustness, though not a critical bug fix.

Impact: Medium
Fix inconsistent return shapes

Ensure return types match the annotated tuple: always return a list of lists for
splits. Wrap test_files in a list when not splitting, and avoid returning [[]] which
signals "one empty split" and may be misinterpreted upstream.

codeflash/tracing/pytest_parallelization.py [49-65]

 if not test_files:
-    return [[]], None
+    return None, None
 ...
 if max_possible_splits == 0:
-    return test_files, test_paths
+    return [list(test_files)], test_paths
Suggestion importance[1-10]: 5


Why: The function's annotated return type suggests consistent shapes; returning [list(test_files)] when not splitting is reasonable. However, changing the "if not test_files" branch to return (None, None) alters behavior and may not be strictly necessary, so the impact is moderate.

Impact: Low
General
Validate per-split replay outputs

Handle missing or None replay_test_file_path per split to avoid appending "None" and
failing later. Collect only valid paths and report which splits failed instead of
exiting all on one failure.

codeflash/tracer.py [140-149]

-for result_pickle_file_path in result_pickle_file_paths:
+for idx, result_pickle_file_path in enumerate(result_pickle_file_paths, start=1):
     try:
         with result_pickle_file_path.open(mode="rb") as f:
             data = pickle.load(f)
-            replay_test_paths.append(str(data["replay_test_file_path"]))
-    except Exception:
-        console.print("❌ Failed to trace. Exiting...")
-        sys.exit(1)
+            rtp = data.get("replay_test_file_path")
+            if rtp:
+                replay_test_paths.append(str(rtp))
+            else:
+                console.print(f"⚠️ No replay test produced for split {idx}.")
+    except Exception as e:
+        console.print(f"❌ Failed to trace split {idx}: {e}")
     finally:
         result_pickle_file_path.unlink(missing_ok=True)
Suggestion importance[1-10]: 6


Why: Filtering out missing replay_test_file_path entries prevents appending "None" and provides clearer diagnostics per split. It's accurate and helpful for resilience but not critical.

Impact: Low

Previous suggestions

Suggestions up to commit 936fa0a
Category | Suggestion | Impact
Possible issue
Prevent duplicate test arguments

After extending updated_sys_argv with a split, set added_paths = True so you don’t
repeatedly inject the same tests on subsequent matches. This prevents duplicated
test arguments in each subprocess call.

codeflash/tracer.py [119-126]

 added_paths = False
 updated_sys_argv = []
 for elem in sys.argv:
-    if elem in test_paths_set:
-        if not added_paths:
-            updated_sys_argv.extend(test_split)
+    if elem in test_paths_set and not added_paths:
+        updated_sys_argv.extend(test_split)
+        added_paths = True
     else:
         updated_sys_argv.append(elem)
Suggestion importance[1-10]: 8


Why: Setting added_paths = True prevents repeatedly injecting test_split into updated_sys_argv, fixing duplicate test arguments in subprocess calls.

Impact: Medium
General
Rebuild pytest args from unknown_args

Instead of iterating over the full sys.argv, rebuild the pytest invocation from
unknown_args to ensure only the intended test paths and flags are included. This
simplifies argument handling and avoids mixing tracer flags with pytest flags.

codeflash/tracer.py [121-126]

-for elem in sys.argv:
-    if elem in test_paths_set:
-        if not added_paths:
-            updated_sys_argv.extend(test_split)
-    else:
-        updated_sys_argv.append(elem)
+# Rebuild pytest command with split tests
+base_pytest = unknown_args[0]
+other_args = [arg for arg in unknown_args[1:] if arg not in test_paths_set]
+updated_sys_argv = [base_pytest, *other_args, *test_split]
Suggestion importance[1-10]: 4


Why: Rebuilding from unknown_args could simplify argument handling, but it risks dropping tracer-specific flags and may not cover all edge cases.

Impact: Low

@aseembits93 marked this pull request as ready for review on September 6, 2025 at 00:09

github-actions bot commented Sep 6, 2025

Persistent review updated to latest commit 060a78c
